Title:
Domesday Book Place Text Descriptions England and Wales


Description: 
A modern Domesday Book for England and Wales, created using open data, Python and frontier LLMs. It accompanies a study submitted for publication to an academic journal.


Author:
George Breckenridge, The University of Edinburgh, 2025


File-by-file description:
1. "2510_fig_1_initial_bua_scale_green_highlight_mapping.ipynb": Python Jupyter Notebook programming file written to produce a green-highlighted map of Built-Up Areas (ONS) in England and Wales, as used for Figure 1 in the academic paper.
2. "2510_pydeck_arc_visualisation_gpt4o_mini_1major_gpt_point_to_bua_centroid_distance_comparisons.html": HTML results file visualising major BUA results for LLM spatial error using a pydeck map.
3. "2510_pydeck_arc_visualisation_gpt4o_mini_2large_gpt_point_to_bua_centroid_distance_comparisons.html": HTML results file visualising large BUA results for LLM spatial error using a pydeck map.
4. "2510_pydeck_arc_visualisation_gpt4o_mini_3medium_gpt_point_to_bua_centroid_distance_comparisons.html": HTML results file visualising medium BUA results for LLM spatial error using a pydeck map.
5. "2510_pydeck_arc_visualisation_gpt4o_mini_4small_gpt_point_to_bua_centroid_distance_comparisons.html": HTML results file visualising small BUA results for LLM spatial error using a pydeck map.
6. "2510_pydeck_arc_visualisation_gpt4o_mini_5minor_gpt_point_to_bua_centroid_distance_comparisons.html": HTML results file visualising minor BUA results for LLM spatial error using a pydeck map.
7. "2512_fig_2_3_domesday_code_generate_place_summary_text.ipynb": Python Jupyter Notebook programming file written to produce open-data-enriched written place summaries for (essentially) all BUAs in England and Wales, as used in Figures 2 and 3 in the academic paper.
8. "2512_fig_3_4_domesday_code_geographic_ai_bias.ipynb": Python Jupyter Notebook programming file written to produce an analysis of the lat/lng toponymic knowledge of a frontier LLM from OpenAI. It produces the data behind files 2-6 in this repository, plus Figures 3 and 4 in the academic paper.
9. "2512_full_output_data_geographic_bias_task.tab": Full results data in a CSV-like file format, showing the differences between LLM-perceived and actual lat/lng locations for (essentially) all BUAs in England and Wales.
10. "Domesday Book Text Results Full.docx": A Microsoft Word document containing the results from file 7 as of 31 Dec 2025. Over 1,000 A4 pages of arranged content.
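
Files 2-6 and 9 compare the point an LLM gives for a place with the centroid of the corresponding BUA. As an illustration only, the sketch below computes a great-circle (haversine) distance between two lat/lng points. It is not taken from the notebooks, which may measure distance differently (for example in a projected coordinate system); the function name, the mean Earth radius constant and the example coordinates are assumptions made for this illustration.

```python
import math

# Mean Earth radius in kilometres (assumed constant for this sketch).
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lng2 - lng1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical example: an LLM-supplied point versus a reference point.
# Approximate coordinates for central London and central Cardiff are used
# purely as stand-ins; they are not values from the results file.
llm_point = (51.5074, -0.1278)
reference_point = (51.4816, -3.1791)
error_km = haversine_km(*llm_point, *reference_point)
print(f"Spatial error: {error_km:.1f} km")
```

The same function could be applied row by row to the results file once its actual column names are known.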


Installation Guidance:

The following environment is recommended, using Python 3.11.7 in a Microsoft Visual Studio Code Jupyter Notebook instance:

Package                   Version
------------------------- -----------
affine                    2.4.0
aiohappyeyeballs          2.6.1
aiohttp                   3.12.13
aiosignal                 1.3.2
annotated-types           0.7.0
anyio                     4.10.0
asttokens                 3.0.0
attrs                     25.3.0
branca                    0.8.1
certifi                   2025.4.26
charset-normalizer        3.4.2
click                     8.2.1
click-plugins             1.1.1
cligj                     0.7.2
colorama                  0.4.6
comm                      0.2.2
contourpy                 1.3.2
cycler                    0.12.1
debugpy                   1.8.14
decorator                 5.2.1
distro                    1.9.0
et_xmlfile                2.0.0
executing                 2.2.0
fastjsonschema            2.21.2
filelock                  3.18.0
folium                    0.19.6
fonttools                 4.58.0
frozenlist                1.7.0
fsspec                    2025.5.0
geopandas                 1.0.1
h11                       0.16.0
httpcore                  1.0.9
httpx                     0.28.1
idna                      3.10
ipykernel                 6.29.5
ipympl                    0.9.7
ipython                   9.2.0
ipython_pygments_lexers   1.1.1
ipywidgets                8.1.7
jedi                      0.19.2
Jinja2                    3.1.6
jiter                     0.10.0
joblib                    1.5.0
jsonschema                4.24.0
jsonschema-specifications 2025.4.1
jupyter_client            8.6.3
jupyter_core              5.7.2
jupyterlab_widgets        3.0.15
kiwisolver                1.4.8
lightning-utilities       0.14.3
lxml                      6.0.1
mapclassify               2.9.0
MarkupSafe                3.0.2
matplotlib                3.10.3
matplotlib-inline         0.1.7
more-itertools            10.7.0
mpmath                    1.3.0
msgpack                   1.1.1
multidict                 6.5.0
nbformat                  5.10.4
nest-asyncio              1.6.0
networkx                  3.4.2
numpy                     2.2.6
openai                    1.102.0
openpyxl                  3.1.5
osmnx                     2.0.3
packaging                 25.0
pandas                    2.2.3
parso                     0.8.4
pillow                    11.2.1
pip                       23.2.1
planetary-computer        1.0.0
platformdirs              4.3.8
pooch                     1.8.2
prompt_toolkit            3.0.51
propcache                 0.3.2
psutil                    7.0.0
pure_eval                 0.2.3
pydantic                  2.11.7
pydantic_core             2.33.2
pydeck                    0.9.1
Pygments                  2.19.1
pyogrio                   0.11.0
pyparsing                 3.2.3
pypiwin32                 223
pyproj                    3.7.1
pystac                    1.13.0
pystac-client             0.8.6
python-dateutil           2.9.0.post0
python-docx               1.2.0
python-dotenv             1.1.0
pytz                      2025.2
pyvista                   0.45.2
pywin32                   310
PyYAML                    6.0.2
pyzmq                     26.4.0
rasterio                  1.4.3
referencing               0.36.2
requests                  2.32.3
rioxarray                 0.19.0
rpds-py                   0.25.1
SciencePlots              2.2.0
scikit-learn              1.6.1
scipy                     1.15.3
scooby                    0.10.1
seaborn                   0.13.2
setuptools                65.5.0
shapely                   2.1.1
six                       1.17.0
sniffio                   1.3.1
stack-data                0.6.3
sympy                     1.14.0
threadpoolctl             3.6.0
torch                     2.7.0
torchaudio                2.7.0
torchmetrics              1.7.2
torchvision               0.22.0
tornado                   6.5.1
tqdm                      4.67.1
traitlets                 5.14.3
trame                     3.10.2
trame-client              3.9.1
trame-common              1.0.0
trame-server              3.4.2
trimesh                   4.6.12
typing_extensions         4.13.2
typing-inspection         0.4.1
tzdata                    2025.2
urllib3                   2.4.0
vtk                       9.4.2
wcwidth                   0.2.13
widgetsnbextension        4.0.14
wslink                    2.3.4
xarray                    2025.6.1
xyzservices               2025.4.0
yarl                      1.20.1
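
To check a local environment against the versions listed above, a short script along the following lines may help. It is a minimal sketch using only the Python standard library; the subset of packages in PINNED is an illustrative selection copied from the table, not a statement of which packages each notebook imports, and the function name is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Illustrative subset of the table above; extend as needed.
PINNED = {
    "geopandas": "1.0.1",
    "pandas": "2.2.3",
    "pydeck": "0.9.1",
    "openai": "1.102.0",
    "python-docx": "1.2.0",
}


def check_versions(pinned):
    """Return {package: (wanted, found, matches)} for each pinned package.

    `found` is None when the package is not installed.
    """
    report = {}
    for name, wanted in pinned.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            found = None
        report[name] = (wanted, found, found == wanted)
    return report


for name, (wanted, found, ok) in check_versions(PINNED).items():
    status = "OK" if ok else "MISMATCH"
    print(f"{name:<15} wanted {wanted:<10} found {found!s:<10} {status}")
```

A mismatch does not necessarily mean the notebooks will fail; the listed versions are simply those recommended above.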


Licence (Harvard Dataverse):
CC0 1.0 Universal (CC0 1.0 Deed). Canonical URL: https://creativecommons.org/publicdomain/zero/1.0/